(bug 43799) create language-specific collations for category sorting
This *finally* allows articles to be sorted correctly on category
pages for 67 languages based on the Latin, Greek and Cyrillic
alphabets.
Fixes bug 29788, bug 41040, and bug 42412 (implementing collations for
Swedish, Polish, Ukrainian).
Full list of language codes this adds support for: af, ast, az, be,
bg, br, bs, ca, co, cs, cy, da, de, dsb, el, en, eo, es, et, eu, fi,
fo, fr, fur, fy, ga, gd, gl, hr, hsb, hu, is, it, kk, kl, ku, ky, la,
lb, lt, lv, mk, mo, mt, nl, no, oc, pl, pt, rm, ro, ru, rup, sco, sk,
sl, smn, sq, sr, sv, tk, tl, tr, tt, uk, uz, vi.
* Include data about first-letter characters for 67 language
tailorings. This data was generated based on
http://developer.mimer.com/charts/tailorings.htm by a Ruby script
(https://www.mediawiki.org/wiki/User:Matma_Rex/generateCollationTailoringData.rb),
then adjusted by hand (removed duplicate definitions for Spanish and
German, changed code fil -> tl (Filipino -> Tagalog)).
* Mark languages verified by native speakers (currently only pl
(Polish), which I verified myself, and fi (Finnish), checked by
Niklas).
* Allow for collations named like 'uca-<langcode>', mapping them to
IcuCollation with the appropriate parameter. The code doesn't check
whether we actually have data for the given language at this point,
as that is checked after the IcuCollation class instance is
constructed.
* Add the tailoring data to the default first-letter file (for the
root collation) before it's cached for the given locale.
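
For illustration only, the name mapping and the first-letter merge
described in the last two points can be sketched as below. This is a
rough Python sketch, not the code in this change: the function names,
the two-entry tailoring table and the plain append (the real first
letters would be ordered by the collator) are all assumptions.

```python
# Hypothetical, tiny stand-in for the generated tailoring data
# (the real data covers 67 languages).
TAILORING_FIRST_LETTERS = {
    "pl": ["Ą", "Ć", "Ę", "Ł", "Ń", "Ó", "Ś", "Ź", "Ż"],
    "sv": ["Å", "Ä", "Ö"],
}


def locale_from_collation_name(name):
    """Return <langcode> for a name shaped like 'uca-<langcode>', else None.

    No check is made here that tailoring data exists for the language;
    that only matters later, when the first letters are assembled.
    """
    prefix = "uca-"
    if name.startswith(prefix) and len(name) > len(prefix):
        return name[len(prefix):]
    return None


def first_letters_for(locale, root_letters):
    """Combine root-collation first letters with a locale's tailored ones.

    Assumption: letters are simply appended; a real implementation would
    sort the combined list with the collator before caching it.
    """
    extra = TAILORING_FIRST_LETTERS.get(locale, [])
    return list(root_letters) + [c for c in extra if c not in root_letters]
```

With this sketch, locale_from_collation_name('uca-pl') yields 'pl',
and a language without tailoring data simply gets the root letters
back unchanged.
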
Change-Id: I838484b9aaf23945fe7880fef2e3da5f5c06877f